Rapid antibiotic susceptibility testing and species identification for mixed samples

Antimicrobial resistance is an increasing problem on a global scale. Rapid antibiotic susceptibility testing (AST) is urgently needed in the clinic to enable personalized prescriptions in high-resistance environments and to limit the use of broad-spectrum drugs. Current rapid phenotypic AST methods do not include species identification (ID), leaving time-consuming plating or culturing as the only available option when ID is needed to make the sensitivity call. Here we describe a method to perform phenotypic AST at the single-cell level in a microfluidic chip that allows subsequent genotyping by in situ FISH. By stratifying the phenotypic AST response on the species of individual cells, it is possible to determine the susceptibility profile for each species in a mixed sample in 2 h. In this proof-of-principle study, we demonstrate the method with four antibiotics and mixed samples containing combinations of seven species.


U-net training:
A U-net that takes a phase-contrast image as input and predicts cell-vs-background maps [6] was shown to perform well on E. coli data in mother-machine channels. However, the U-net requires considerably more labeled data than we have for mixed-species images. To improve the loss values on our dataset, we used a pre-training step on a larger dataset (2583 phase-contrast images and corresponding binary masks) containing only E. coli. Pre-training of the U-net was done on images of size 320x320 pixels with a batch size (the number of training examples used during one iteration of the training loop) of 8. Each of these images was obtained by randomly cropping 320x320 pixels from the original full-size images (1024x800 pixels). The cropped images were scaled by a random factor between 0.75 and 1.25, to cover various ranges of cell sizes, and then resized back to 320x320. Random noise, illumination adjustments, Gaussian blurring, and rotations were applied to the training data pairs as described in [7]. Any operation that produced non-binary values in the binary mask was followed by Otsu thresholding. 20% of the training data was used for validation of the model. In addition to image augmentations, weight maps were generated using morphological operations to give more weight to the pixels separating cells in the loss function, as described in the original U-net paper [8]. SI fig 1d shows a sample input image, its corresponding binary mask and the weight map used during the training process. The loss function used to train the network is dice loss + weighted binary cross-entropy [9]. The Adam optimizer [10] was used with a learning rate of 10^-4 and the U-net was trained for 10 epochs, until the losses flattened out. The weights were stored and later used as initialization of new networks.
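As an illustration of this combined loss, the following is a minimal sketch assuming PyTorch; the function names and tensor shapes are ours and not taken from the original code.

```python
import torch
import torch.nn.functional as F

def dice_loss(logits, target, eps=1e-6):
    """Soft Dice loss on sigmoid probabilities; logits and target are (B, 1, H, W)."""
    prob = torch.sigmoid(logits)
    inter = (prob * target).sum(dim=(2, 3))
    union = prob.sum(dim=(2, 3)) + target.sum(dim=(2, 3))
    return (1.0 - (2.0 * inter + eps) / (union + eps)).mean()

def segmentation_loss(logits, target, weight_map):
    """Dice loss plus binary cross-entropy weighted per pixel by the weight map,
    which emphasizes the pixels separating touching cells."""
    wbce = F.binary_cross_entropy_with_logits(logits, target, weight=weight_map)
    return dice_loss(logits, target) + wbce
```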
After the pre-training step, we initialized a new U-net with the pre-trained weights and trained the network on the same dataset used for training the Omnipose network. Augmentation and training of the network were similar to the pre-training. A learning rate of 10^-5 was used and the network was trained until the loss curves flattened out. As for the previous model, we evaluated the network on both subsets of data, i.e. mixed species vs. E. coli. SI fig 1e shows the comparison between the different subsets. The U-net performance is as high as that of Omnipose for E. coli but somewhat lower for the mixed-species data. SI fig 2 shows the phase-contrast image and the corresponding cell detections for each species loaded in separate chips. The Omnipose network trained on our data was used on all of these images, and none of them were part of the training data of the network.

Methods 2: Channel detection
Channel detection was done using the same U-net architecture as used for cell segmentation. The original U-net described in [8] has the number of feature channels varying from 64 at the input to 1024 before upsampling, doubling after every downsampling step. In the channel detection net, the number of feature channels was reduced by a factor of 8 at all stages compared to the original U-net model. Training data were generated using image processing operations (intensity histograms, peak finding on intensity histograms, binary morphological operations, etc.). The dataset was manually curated to remove images whose channels could not be detected with intensity histograms and peak finding operations. The same data augmentations and loss functions used for the cell segmentation model were used during training of the channel detection network (SI section 1), except that all pixels were given the same weight in the loss function. The training optimization was similar to that of the cell segmentation model. Each phase-contrast image has a fixed number of channels in which cells grow. The mother machine channels were detected in each phase-contrast image of the time-lapse microscopy data and the location of each channel was mapped to its channel number in each image. Using these locations, time-series stacks of individual growth channels were constructed for further analysis. SI fig 3d shows the distribution of pixel offsets between channel locations detected using the channel segmentation predictions and the ground truths of the test data not seen by the network.
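A minimal sketch of how per-channel time-series stacks could be built from the predicted channel masks, assuming scikit-image and NumPy and channel positions that are fixed over time; all function names are illustrative.

```python
import numpy as np
from skimage.measure import label, regionprops

def channel_boxes(channel_mask):
    """Bounding boxes (min_row, min_col, max_row, max_col) of the detected growth
    channels, sorted left to right so each box maps to a channel number."""
    regions = regionprops(label(channel_mask > 0))
    return sorted((r.bbox for r in regions), key=lambda b: b[1])

def channel_stacks(frames, boxes):
    """Crop every growth channel out of each phase-contrast frame and stack the
    crops over time, giving one (T, H, W) array per channel."""
    return [np.stack([frame[r0:r1, c0:c1] for frame in frames])
            for (r0, c0, r1, c1) in boxes]
```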

Supplementary Figure 3: a) Sample training data for channel detection: example phase-contrast and binary mask images. Scale bar, 5 µm. b) Training/validation loss curves. c) Phase-contrast image and its corresponding channel segmentation mask. Scale bar, 20 µm. d) Pixel offset in channel locations between ground truths and predictions on the test dataset (n=7760 channels total).
Methods 3: Processing FISH images with a single color per species
Fluorescence images were acquired in 4 channels for the FISH-ID. Each FISH probe (Supplementary data 1) was designed to bind to a single species. Image processing operations such as smoothing and histogram equalization were performed on the images, and the background was subtracted using the empty-channel values present in each image. Regions in each image containing signal above a threshold were detected, and regions longer than 50 pixels were considered for species ID. The thresholds were set individually for each of the 4 fluorescence channel images during analysis to obtain a binary mask for each image. Thresholds were determined automatically using the Otsu method, or set to 8000 after background subtraction, whichever was higher.
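The thresholding rule described above could look as follows; this is a sketch assuming scikit-image, and the interpretation of "longer than 50 pixels" as the longest side of a region's bounding box is our assumption.

```python
import numpy as np
from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops

def fish_mask(image, floor=8000, min_length=50):
    """Binarize one background-subtracted fluorescence channel with the higher of
    the Otsu threshold and a fixed floor, keeping only regions longer than
    `min_length` pixels for species identification."""
    threshold = max(threshold_otsu(image), floor)
    labeled = label(image > threshold)
    keep = np.zeros(image.shape, dtype=bool)
    for region in regionprops(labeled):
        r0, c0, r1, c1 = region.bbox
        if max(r1 - r0, c1 - c0) > min_length:  # assumption: "longer" = longest bbox side
            keep[labeled == region.label] = True
    return keep
```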

Methods 4: Cell tracking
Cell tracking involves linking cells from one frame to the next while they grow in the mother-machine channels. A cell in the frame at time t was compared against all cells in the frame at time t+1, and the association between them was predicted as one of 3 categories: no link, link of movement type, or link of division type. The classification problem was formulated as an edge classification problem similar to the one described in [11]. The features of the cells (centroid, eccentricity, length, area, etc.) were transformed using MLPs (multi-layer perceptrons), and edge features were constructed from the features of the objects linked by each edge. The edge features were then transformed again using an MLP. The loss function for training is a combination of batch triplet loss and affinity loss as in [11], but with a modification for predicting multiple types of associations. An overview of the network is shown in SI fig 5a. To obtain training data for cell tracking, we built a simulator that simulates the growth of cells inside mother-machine channels and keeps track of movements and divisions. The simulation constructs binary masks of cells whose growth rates are sampled from a distribution at each time step. At each time step, we update the image of the cell as well as its position in the growth channel. The type of link between two cells in consecutive frames is stored during the simulation and used as ground truth. Cells that reach the end of the channel are dropped from further time points. The parameters used for the simulation, such as cell size, division size, growth rate and shape, were chosen to emulate the real data. We used 1000 stacks of channels, each containing 100 frames, together with their corresponding links as training data. SI fig 5b shows a sample stack of 30 frames generated by the simulator with two cell shapes representing rod-shaped and coccoid cells.
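A minimal sketch of the pairwise edge classification idea (cell features embedded by an MLP, concatenated pair features classified into three link types), assuming PyTorch; layer sizes, the feature count and class ordering are illustrative, and the triplet/affinity losses are omitted.

```python
import torch
import torch.nn as nn

class EdgeClassifier(nn.Module):
    """Classify the link between a cell at time t and a cell at time t+1 into
    {no-link, movement, division} from hand-crafted cell features
    (centroid, eccentricity, length, area, ...)."""

    def __init__(self, n_features=6, hidden=64):
        super().__init__()
        # Node MLP: embeds each cell's feature vector.
        self.node_mlp = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden), nn.ReLU())
        # Edge MLP: maps the concatenated pair embedding to 3 link classes.
        self.edge_mlp = nn.Sequential(nn.Linear(2 * hidden, hidden), nn.ReLU(),
                                      nn.Linear(hidden, 3))

    def forward(self, feats_t, feats_t1):
        # feats_t: (N, n_features) cells at time t; feats_t1: (M, n_features) cells at t+1.
        a = self.node_mlp(feats_t)                      # (N, hidden)
        b = self.node_mlp(feats_t1)                     # (M, hidden)
        # Build all N x M candidate edges by pairing every cell at t with every cell at t+1.
        pairs = torch.cat([a.unsqueeze(1).expand(-1, b.shape[0], -1),
                           b.unsqueeze(0).expand(a.shape[0], -1, -1)], dim=-1)
        return self.edge_mlp(pairs)                     # (N, M, 3) logits over link classes
```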
When running the tracker network on experimental data, a cell in the frame at time t was compared to all cells at time t+1 to obtain probabilities over the edge classes. The class with the highest probability was used to describe the edge between the compared cells. Links were set between consecutive frames and tracks were constructed to generate cell lineages. Spurious links predicted by the network (especially merge events due to segmentation errors) were cleaned up during track construction using the IoU metric between the linked cells. Division events were additionally checked using the IoU and the area ratio between mother and daughter cells. Any link predictions that did not meet the IoU criteria were dropped from tracking.
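A sketch of the IoU-based link cleaning, assuming binary masks for the linked cells; the threshold values are illustrative and not the ones used in the analysis.

```python
import numpy as np

def iou(mask_a, mask_b):
    """Intersection over union of two binary cell masks of the same shape."""
    union = np.logical_or(mask_a, mask_b).sum()
    return np.logical_and(mask_a, mask_b).sum() / union if union else 0.0

def keep_movement_link(mask_t, mask_t1, min_iou=0.3):
    """Accept a predicted movement link only if the two cells overlap sufficiently."""
    return iou(mask_t, mask_t1) >= min_iou

def keep_division_link(mother, daughter1, daughter2, min_iou=0.1, max_area_ratio=0.7):
    """Accept a division link only if both daughters overlap the mother and neither
    exceeds a fraction of the mother's area (thresholds are illustrative)."""
    overlap_ok = iou(mother, daughter1) >= min_iou and iou(mother, daughter2) >= min_iou
    area_ok = (daughter1.sum() <= max_area_ratio * mother.sum()
               and daughter2.sum() <= max_area_ratio * mother.sum())
    return overlap_ok and area_ok
```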

Methods 5: Species assignment for tracks using FISH data
The tracks ending in the last frame before fixing the cells were labeled with the species names obtained from the FISH data. This labeling was based on which bounding box of the fluorescence data the centroid of the last blob in each track fell within. These species labels were rolled back in time to t = 0, thus labeling all the tracks from the beginning of the experiment. SI fig 6 shows a few of these labeled tracks in different colors for different species. In a typical experiment, mixed-species samples are loaded in the chip and a combinatorial FISH assay is performed to obtain images such as the one included in the main text (Fig 4b). As shown in SI fig 7, the adapter sequences and fluorescent probes are designed such that each species is uniquely labeled in two different fluorescence channels. To assign species labels to tracks, a classification method that can classify the signal from the 4-dimensional fluorescence image stack is required. Training data for the classification method were created using the bounding box drawing tool LabelImg [12]. Data for each species were obtained by picking a few images from control experiments done with single species and from experiments with 2 species loaded onto the microfluidic chip. All the data were pooled and balanced to obtain 30000 (4-dimensional) data points for each species. Data for the background class were sampled randomly from the background of all the images in the dataset. We used the same concentration of fluorescent probes and the same imaging conditions in all the experiments. Signals from regions of the chip representing autofluorescence of PDMS were subtracted from the training data. The dataset was balanced to get the same number of data points in all the classes. We split the data into a test set (30%) and a training set (70%) and trained a random forest classifier using sklearn. SI fig 8a shows the classification accuracy and SI fig 8b the confusion matrix on the test data. We use the classifier to classify each pixel into 8 classes (7 for the individual species with a known FISH signature, 1 for the background). SI fig 8c shows pixel-wise class assignments on fluorescence image stacks taken from each of the repeats of the four experiments shown in main text Fig 4, i.e. data from experiments not seen in training. The classification is in line with what is expected from each of these experiments, i.e. the classifier finds the species that were loaded in that particular experiment with very high accuracy. A small percentage of pixels can be misclassified, but the probability of finding a continuous cell-like region of these misclassified pixels is very low (SI fig 8c).
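A minimal sketch of the per-pixel random forest classification with scikit-learn; the array names, file names, and forest size are assumptions, and each data point is the background-subtracted 4-channel intensity of one pixel.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

# X: (n_pixels, 4) background-subtracted fluorescence intensities,
# y: (n_pixels,) class labels 0..7 (7 species + background), balanced per class.
X, y = np.load("fish_pixels.npy"), np.load("fish_labels.npy")  # hypothetical file names

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    stratify=y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

pred = clf.predict(X_test)
print(accuracy_score(y_test, pred))
print(confusion_matrix(y_test, pred))

# At analysis time, every pixel of a 4-channel image stack (H, W, 4) is classified:
# classes = clf.predict(stack.reshape(-1, 4)).reshape(stack.shape[:2])
```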

Supplementary Figure 6: Tracks labeled with different colors for different species. a) E. faecalis (cyan) and E. coli (blue) in the same channel. b) P. aeruginosa (green) and a track with undetermined species (magenta). c) K. pneumoniae (red). Scale bar, 5 µm for all panels. Time-series stacks were constructed for n>2000 (SI Fig 9h) growth channels present in all experiments.
We used the growth-channel location information of the last phase-contrast image before genotyping to extract the fluorescence image slices corresponding to each growth channel. Bounding boxes over continuous regions of the image stack belonging to the same class of cells were calculated. Each bounding box also had to be longer than 90 pixels to be counted as valid and used for species assignment. Tracks whose centroids fell within these bounding boxes were labeled with the cell type of the predicted class.
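A sketch of the bounding-box construction and track labeling described above, assuming scikit-image; the class encoding (0 = background) and the length measure are our assumptions.

```python
import numpy as np
from skimage.measure import label, regionprops

def species_boxes(class_map, min_length=90):
    """From a per-pixel class map of one growth channel, return (class, bbox) pairs
    for continuous same-class regions longer than `min_length` pixels."""
    boxes = []
    for cls in np.unique(class_map):
        if cls == 0:  # assumption: 0 encodes the background class
            continue
        for region in regionprops(label(class_map == cls)):
            r0, c0, r1, c1 = region.bbox
            if max(r1 - r0, c1 - c0) > min_length:
                boxes.append((cls, region.bbox))
    return boxes

def assign_species(track_centroid, boxes):
    """Label a track by the class of the box its final centroid falls in (None if no box)."""
    r, c = track_centroid
    for cls, (r0, c0, r1, c1) in boxes:
        if r0 <= r < r1 and c0 <= c < c1:
            return cls
    return None
```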

Methods 7: Growth curve estimation
Growth rates were calculated by fitting exponential functions to the areas of cells in a single track over a sliding window of 5 frames, corresponding to 10 min. For each time point in the experiment, growth rates were collected from all labeled tracks in that frame that existed for the full 5-frame window. These growth rates were averaged to get the mean growth rate of that particular species. Growth rates were collected for both the reference and the antibiotic-treated population. The normalized growth rate was calculated as the ratio between the growth rates of the antibiotic-treated and reference populations. The plots in SI fig 9a, 9b, 9c and 9d show the normalized growth rates of all four species used in an experiment, along with their respective standard errors of the mean.
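A sketch of the sliding-window exponential fit, assuming NumPy; the frame interval is an assumption chosen so that 5 frames span roughly 10 min, and the growth rate is taken as the slope of log(area) within each window.

```python
import numpy as np

def sliding_growth_rates(areas, dt_min=2.0, window=5):
    """Fit log(area) = rate * t + c over each sliding 5-frame window and return
    one growth rate per window position (dt_min is an assumed frame interval)."""
    t = np.arange(window) * dt_min
    rates = []
    for i in range(len(areas) - window + 1):
        log_area = np.log(areas[i:i + window])
        rate, _ = np.polyfit(t, log_area, 1)  # slope of the linear fit to log(area)
        rates.append(rate)
    return np.array(rates)
```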